Option to remove data from autocausality object after fit #251

julianteichgraber · 2023-04-12T16:14:09Z

Problem

autocausality runs into memory issues because the fitted estimator automatically stores the data after fitting. DoWhy and Scikit-Learn estimators do not do this, and there is no immediate need for it.

Proposed changes

Add an option to the fit method to remove the data after fitting.
Also removed initialisation of autocausality with data, since it is never used. Passing data only through the fit method suffices and is consistent with DoWhy / Scikit-Learn.
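
For illustration, a minimal sketch of the pattern described above. It is not the actual auto-causality code: the class `SketchEstimator`, the `store_data` flag, and the attribute names are hypothetical, and the naive difference in means only stands in for a real estimator.

```python
from typing import Dict, List, Optional

Rows = List[Dict[str, float]]


class SketchEstimator:
    """Hypothetical stand-in for an estimator; not the auto-causality API."""

    def __init__(self) -> None:
        # No data is passed at construction; it only enters through fit().
        self.train_data: Optional[Rows] = None
        self.effect_: Optional[float] = None

    def fit(
        self,
        rows: Rows,
        treatment: str,
        outcome: str,
        store_data: bool = True,  # hypothetical flag name
    ) -> "SketchEstimator":
        treated = [r[outcome] for r in rows if r[treatment] == 1]
        control = [r[outcome] for r in rows if r[treatment] == 0]
        # Placeholder "model": a naive difference in group means.
        self.effect_ = sum(treated) / len(treated) - sum(control) / len(control)
        # Keep a reference to the data only if the caller asks for it.
        self.train_data = rows if store_data else None
        return self


rows = [
    {"treatment": 1, "outcome": 3.0},
    {"treatment": 1, "outcome": 5.0},
    {"treatment": 0, "outcome": 1.0},
    {"treatment": 0, "outcome": 2.0},
]
lean = SketchEstimator().fit(rows, "treatment", "outcome", store_data=False)
full = SketchEstimator().fit(rows, "treatment", "outcome")
```

With `store_data=False` the fitted object keeps only its fitted state, so the caller alone controls the lifetime of the dataset.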

In the future, I will consider moving the train-test split to CausalityDataset so that test_df always stays available.

Types of changes

What types of changes does your code introduce to Auto-Causality?
Put an x in the boxes that apply

Bugfix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation Update (if none of the other choices apply)

Checklist

I have read the CONTRIBUTING doc
Description above provides context of the change
I have added tests that prove my fix is effective or that my feature works
Unit tests for changes (not needed for documentation changes)
Bumping version in setup.py is an individual PR and not mixed with feature or bugfix PRs
Commits follow "How to write a good git commit message"
Relevant documentation is updated including usage instructions

EgorKraevTransferwise · 2023-04-14T07:30:25Z

Maybe worth adding a method to delete the data after the fit, for the case when you want to fit, further play with the data, then pickle the fitted model but without the data?
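
A rough sketch of what such a method could look like, under the same caveat: `SketchEstimator` and `clear_data` are hypothetical names used for illustration, not the auto-causality API.

```python
import pickle
from typing import Dict, List, Optional

Rows = List[Dict[str, float]]


class SketchEstimator:
    """Hypothetical stand-in for an estimator; not the auto-causality API."""

    def __init__(self) -> None:
        self.train_data: Optional[Rows] = None
        self.n_rows_: Optional[int] = None

    def fit(self, rows: Rows) -> "SketchEstimator":
        # Data is kept after fit so the caller can keep working with it.
        self.train_data = rows
        self.n_rows_ = len(rows)
        return self

    def clear_data(self) -> None:
        # Drop only the stored data; fitted state such as n_rows_ stays.
        self.train_data = None


rows = [{"x": float(i)} for i in range(1000)]
model = SketchEstimator().fit(rows)

size_with_data = len(pickle.dumps(model))
model.clear_data()
size_without_data = len(pickle.dumps(model))

# The pickled model round-trips without carrying the dataset along.
restored = pickle.loads(pickle.dumps(model))
```

Here the caller's own `rows` object is untouched; only the estimator's reference to it is released before pickling.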

option to remove data after fit

7d4c709

julianteichgraber linked an issue Apr 12, 2023 that may be closed by this pull request

Stop saving copies of the dataset in fitted estimators #218

Open

EgorKraevTransferwise approved these changes Apr 21, 2023


AlxdrPolyakov closed this Sep 4, 2024

AlxdrPolyakov deleted the memory-load-reduce branch September 5, 2024 12:28
